 parallel support vector machine


MLPSVM: A new parallel support vector machine to multi-label learning

arXiv.org Machine Learning

Multi-label learning has attracted the attention of the machine learning community. The problem transformation method Binary Relevance converts a familiar single-label algorithm into a multi-label algorithm. Binary Relevance is widely used because of its simple structure and efficiency. However, it does not consider the dependencies between labels, which makes some tasks cumbersome to handle. This paper proposes a multi-label learning algorithm that can also be used for single-label classification. It is based on the standard support vector machine and replaces the original single decision hyperplane with two parallel decision hyperplanes; the authors call it the multi-label parallel support vector machine (MLPSVM). At the end of the article, MLPSVM is compared with other multi-label learning algorithms. The experimental results show that the algorithm performs well on the data sets tested.


Parallel Support Vector Machines: The Cascade SVM

Neural Information Processing Systems

We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. Instead of analyzing the whole training set in one optimization step, the data are split into subsets and optimized separately with multiple SVMs. The partial results are combined and filtered again in a 'Cascade' of SVMs until the global optimum is reached. The Cascade SVM can be spread over multiple processors with minimal communication overhead and requires far less memory, since the kernel matrices are much smaller than for a regular SVM. Convergence to the global optimum is guaranteed with multiple passes through the Cascade, but even a single pass provides good generalization. A single pass is 5x to 10x faster than a regular SVM for problems of 100,000 vectors when implemented on a single processor. Parallel implementations on a cluster of 16 processors were tested with over 1 million vectors (2-class problems), converging in a day or two, while a regular SVM had not converged after more than a week.
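
The split, filter, and combine idea in this abstract can be illustrated with a minimal single-pass sketch. It is not the authors' implementation and rests on several assumptions: a linear kernel, a simple dual coordinate descent solver in place of whatever solver the paper uses, a binary-tree merge order, and no feedback passes. All function names (`train_linear_svm`, `support_vectors`, `cascade_pass`, `predict`) and the toy data are hypothetical, and the loops run sequentially where a real system would train the subsets of each layer on separate processors.

```python
import random

def train_linear_svm(points, C=1.0, epochs=50):
    """Linear soft-margin SVM via dual coordinate descent (assumed solver).
    A constant feature of 1.0 stands in for the bias. Returns (w, alphas)."""
    xs = [(x1, x2, 1.0) for (x1, x2), _ in points]
    ys = [y for _, y in points]
    alphas = [0.0] * len(points)
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for i, (x, y) in enumerate(zip(xs, ys)):
            q_ii = sum(v * v for v in x)
            grad = y * sum(wj * xj for wj, xj in zip(w, x)) - 1.0
            # Projected Newton step on one dual variable, clipped to [0, C].
            new_alpha = min(max(alphas[i] - grad / q_ii, 0.0), C)
            delta = (new_alpha - alphas[i]) * y
            w = [wj + delta * xj for wj, xj in zip(w, x)]
            alphas[i] = new_alpha
    return w, alphas

def support_vectors(points, alphas, tol=1e-8):
    """Keep only the points whose dual variable is non-zero."""
    return [p for p, a in zip(points, alphas) if a > tol]

def cascade_pass(data, n_subsets=4):
    """One pass through a binary cascade; n_subsets must be a power of two."""
    layer = [data[i::n_subsets] for i in range(n_subsets)]
    while True:
        filtered = []
        for subset in layer:  # each subset could be trained on its own processor
            w, alphas = train_linear_svm(subset)
            filtered.append(support_vectors(subset, alphas))
        if len(filtered) == 1:
            return w, filtered[0]
        # Combine the surviving support vectors pairwise for the next layer.
        layer = [filtered[i] + filtered[i + 1] for i in range(0, len(filtered), 2)]

def predict(w, x):
    """Sign of the decision function for a 2-D point x."""
    return 1 if w[0] * x[0] + w[1] * x[1] + w[2] > 0 else -1

# Toy, linearly separable 2-class data (made up for illustration only).
rng = random.Random(0)
data = []
for i in range(64):
    y = 1 if i % 2 == 0 else -1
    data.append(((y * rng.uniform(1, 3), y * rng.uniform(1, 3)), y))
rng.shuffle(data)

w, svs = cascade_pass(data)
print("support vectors kept after one pass:", len(svs), "of", len(data))
```

Each layer only ever trains on a subset or on the support vectors that survived the previous layer, which is why the per-solver problem size (and, with a non-linear kernel, the kernel matrix) stays small. A fuller version would feed the final support vectors back into the first layer and repeat until the set stops changing, matching the multi-pass convergence described in the abstract.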